W5: Data Visualization

Data Visualization

Common Plots

Univariate

  • Numeric: histogram

  • Character (categorical): bar plot

Bivariate

  • Numeric vs. Numeric: scatterplot, line plot

  • Numeric vs. Character: box plot

Why focus on these plots?

Grammar of Graphics

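The grammar of graphics breaks a plot down into four parts:

  • Data: the data frame passed to ggplot()

  • Mapping to data: aes(), which ties variables to x, y, color, and so on

  • Geometry: a geom_*() function that chooses the type of plot

  • Additional settings: themes, labels, scales, and facets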
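
As a sketch of how the four parts line up in code, the plot below is assembled one component at a time. It assumes the ggplot2 package is installed and that penguins comes from the palmerpenguins package. The slides do not state either, although the column names used in the later examples match that dataset.

```r
library(ggplot2)
library(palmerpenguins)  # assumption: source of the `penguins` data frame

p <- ggplot(data = penguins) +                         # 1. Data
  aes(x = bill_length_mm, y = bill_depth_mm) +         # 2. Mapping to data
  geom_point() +                                       # 3. Geometry
  labs(x = "Bill length (mm)", y = "Bill depth (mm)")  # 4. Additional settings

p  # printing the object draws the plot
```

Each part is joined with +, so a component can be swapped (for example, a different geom) without touching the others.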

Think about making plots like using recipes from a cookbook: https://r-graphics.org/

Histogram

ggplot(penguins) + aes(x = bill_length_mm) + geom_histogram()

Histogram with a plot theme

ggplot(penguins) + aes(x = bill_length_mm) + geom_histogram() + theme_bw()

Histogram with options

ggplot(penguins) + aes(x = bill_length_mm) + geom_histogram(binwidth = 5)
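
The binwidth option fixes the width of each bin in the units of the x variable (millimeters here). geom_histogram() also accepts bins, which sets the number of bins instead. A brief sketch comparing the two, assuming penguins comes from the palmerpenguins package; the value 20 is an arbitrary choice:

```r
library(ggplot2)
library(palmerpenguins)  # assumption: source of the `penguins` data frame

# Fix the width of each bin: every bar spans 5 mm of bill length
p_width <- ggplot(penguins) + aes(x = bill_length_mm) + geom_histogram(binwidth = 5)

# Or fix the number of bins and let ggplot2 work out the width
p_bins <- ggplot(penguins) + aes(x = bill_length_mm) + geom_histogram(bins = 20)

# layer_data() returns the computed bins, which helps when checking a setting
bin_table <- layer_data(p_width)
head(bin_table[, c("xmin", "xmax", "count")])
```

Rows with a missing bill length are dropped with a warning, so the counts can sum to slightly less than the number of rows in the data.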

Bar plots

Bar plots automatically count each group for you, so you only need to provide one variable (axis).

ggplot(penguins) + aes(x = species) + geom_bar()

Bar plots, providing both axes

Alternatively, if you want to provide both axes for plotting, compute the counts first:

library(dplyr)  # provides group_by(), summarise(), and n()
penguins_grouped = group_by(penguins, species)
penguins_summary = summarise(penguins_grouped, n_species = n())
penguins_summary
# A tibble: 3 × 2
  species   n_species
  <fct>         <int>
1 Adelie          152
2 Chinstrap        68
3 Gentoo          124
ggplot(penguins_summary) + aes(x = species, y = n_species) + geom_bar(stat = "identity")
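
For reference, the same pre-summarised bar plot can be written more compactly. This sketch uses dplyr's count(), which groups and tallies in one step, and geom_col(), which ggplot2 provides for bars whose heights are already in the data (it behaves like geom_bar(stat = "identity")). It assumes dplyr and palmerpenguins are installed.

```r
library(dplyr)
library(ggplot2)
library(palmerpenguins)  # assumption: source of the `penguins` data frame

# count() replaces the group_by() + summarise(n()) pair
penguins_summary <- count(penguins, species, name = "n_species")

# geom_col() draws each bar at the height given by y
p <- ggplot(penguins_summary) + aes(x = species, y = n_species) + geom_col()
p
```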

Scatterplot

ggplot(penguins) + aes(x = bill_length_mm, y = bill_depth_mm) + geom_point()

Multivariate Scatterplot by color

ggplot(penguins) + aes(x = bill_length_mm, y = bill_depth_mm, color = species) + geom_point()

Multivariate Scatterplot by facet

ggplot(penguins) + aes(x = bill_length_mm, y = bill_depth_mm) + geom_point() + facet_wrap(~species)
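
The list of common plots also names the line plot for numeric vs. numeric data. Line plots suit data that are ordered along x, such as time. A minimal sketch, assuming the year and body_mass_g columns of the palmerpenguins data (columns not used elsewhere in these slides); yearly and mean_mass are illustrative names:

```r
library(dplyr)
library(ggplot2)
library(palmerpenguins)  # assumption: source of the `penguins` data frame

# One row per species per year: mean body mass, ignoring missing values
yearly <- summarise(
  group_by(penguins, species, year),
  mean_mass = mean(body_mass_g, na.rm = TRUE),
  .groups = "drop"
)

# group tells geom_line() which points to connect into one line
p <- ggplot(yearly) +
  aes(x = year, y = mean_mass, color = species, group = species) +
  geom_line() +
  geom_point() +
  scale_x_continuous(breaks = 2007:2009)  # label whole years only
p
```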

Boxplot

ggplot(penguins) + aes(x = species, y = bill_depth_mm) + geom_boxplot()

Grouped Boxplot

ggplot(penguins) + aes(x = species, y = bill_depth_mm, color = island) + geom_boxplot()

Some additional options

ggplot(data = penguins) +
  aes(x = bill_length_mm, y = bill_depth_mm, color = species) +
  geom_point() +
  labs(x = "Bill Length", y = "Bill Depth", title = "Comparison of penguin bill length and bill depth across species") +
  scale_x_continuous(limits = c(30, 60))

Summary of options

data


geom_point: x, y, color, shape

geom_line: x, y, group, color

geom_histogram: x (or y for a horizontal histogram), fill

geom_bar: x, fill

geom_boxplot: x, y, fill, color


facet_wrap


labs

scale_x_continuous

scale_y_continuous

scale_x_discrete

scale_y_discrete
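
A short sketch pulling together several options from this summary that the earlier slides do not demonstrate: the fill aesthetic, scale_x_discrete(), and scale_y_continuous(). The category order, breaks, and labels are illustrative choices, and penguins is assumed to come from the palmerpenguins package.

```r
library(ggplot2)
library(palmerpenguins)  # assumption: source of the `penguins` data frame

p <- ggplot(penguins) +
  aes(x = species, y = bill_depth_mm, fill = species) +          # fill colors the box interiors
  geom_boxplot() +
  scale_x_discrete(limits = c("Gentoo", "Chinstrap", "Adelie")) +  # reorder the categories
  scale_y_continuous(breaks = seq(12, 22, by = 2)) +             # tick marks every 2 mm
  labs(x = "Species", y = "Bill depth (mm)", fill = "Species")
p
```

Note that color changes the outline of a box while fill changes its interior, which is why the summary lists both for geom_boxplot.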

esquisse as a helper

Consider the esquisse package to help generate your ggplot code via drag and drop.

library(esquisse)

esquisser(penguins)

R Graphics Cookbook

An excellent resource: https://r-graphics.org/